15. Merging Datasets

1. Rename 2008 columns to distinguish from 2018 columns after the merge

To do this, use pandas' rename() with a lambda function. In the lambda function, take the first 10 characters of the column label and concatenate them with _2008. (Taking only the first 10 characters prevents really long column names.)

The lambda function should look something like this: lambda x: x[:10] + "_2008"

In your rename() call, don't forget to pass the lambda function through the columns= parameter! Without it, rename() applies the function to the row index labels instead of the column labels.
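As a minimal sketch of this step, the snippet below applies the lambda to a tiny stand-in DataFrame. The name df_08 and the sample columns and values are assumptions made up for illustration; substitute the 2008 DataFrame from your own workspace.

```python
import pandas as pd

# Hypothetical stand-in for the 2008 dataset; names and values are illustrative only.
df_08 = pd.DataFrame({
    "model": ["ACURA RDX"],
    "cmb_mpg": [19.0],
    "greenhouse_gas_score": [4],
})

# Keep the first 10 characters of each column label and append "_2008".
# columns= makes rename() apply the function to column labels, not the index.
df_08 = df_08.rename(columns=lambda x: x[:10] + "_2008")

print(list(df_08.columns))
# ['model_2008', 'cmb_mpg_2008', 'greenhouse_2008']
```

Labels shorter than 10 characters are kept whole, since slicing past the end of a string is safe in Python.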

2. Perform inner merge

To answer the last question, we are only interested in how the same model of car has been updated and how the new model's mpg compares to the old model's mpg.

Perform an inner merge with left_on set to model_2008 and right_on set to model. See the pandas documentation for merge() for details on these parameters.
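The merge might look roughly like the sketch below. The DataFrame names (df_08, df_18, df_combined) and the sample rows are assumptions for illustration, not the course data.

```python
import pandas as pd

# Hypothetical stand-ins for the renamed 2008 data and the 2018 data.
df_08 = pd.DataFrame({
    "model_2008": ["ACURA RDX", "AUDI A4", "OLD ONLY"],
    "cmb_mpg_2008": [19.0, 23.0, 15.0],
})
df_18 = pd.DataFrame({
    "model": ["ACURA RDX", "AUDI A4", "NEW ONLY"],
    "cmb_mpg": [22.0, 27.0, 30.0],
})

# An inner merge keeps only the models that appear in both years.
df_combined = df_08.merge(
    df_18, left_on="model_2008", right_on="model", how="inner"
)

print(df_combined.shape)
# (2, 4)
```

Because the key columns have different names, both model_2008 and model are kept in the result, so the merged column count is the sum of the column counts of the two inputs. Checking df_combined.shape on your own data is a quick way to answer the quiz question.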

How many columns are in your new merged dataset?

SOLUTION: 26